3,403 research outputs found

A new SVD approach to optimal topic estimation

Author: Ke Zheng Tracy
Wang Minzhe
Publication venue
Publication date: 03/07/2019
Field of study

In probabilistic topic models, the quantity of interest, a low-rank matrix consisting of topic vectors, is hidden in the text corpus matrix, masked by noise, and Singular Value Decomposition (SVD) is a potentially useful tool for learning such a matrix. However, different rows and columns of the matrix are usually on very different scales, and the connection between this matrix and the singular vectors of the text corpus matrix is usually complicated and hard to spell out, so using SVD to learn topic models is challenging. We overcome the challenges by introducing a proper Pre-SVD normalization of the text corpus matrix and a proper column-wise scaling for the matrix of interest, and by revealing a surprising Post-SVD low-dimensional simplex structure. The simplex structure, together with the Pre-SVD normalization and column-wise scaling, allows us to conveniently reconstruct the matrix of interest, and motivates a new SVD-based approach to learning topic models. We show that under the popular probabilistic topic model \citep{hofmann1999}, our method has a faster rate of convergence than existing methods in a wide variety of cases. In particular, for cases where documents are long or $n$ is much larger than $p$, our method achieves the optimal rate. At the heart of the proofs is a tight element-wise bound on singular vectors of a multinomially distributed data matrix, which does not exist in the literature and which we derive ourselves. We have applied our method to two data sets, Associated Process (AP) and Statistics Literature Abstract (SLA), with encouraging results. In particular, there is a clear simplex structure associated with the SVD of the data matrices, which largely validates our discovery. Comment: 73 pages, 8 figures, 6 tables; considered two different VH algorithms, OVH and GVH, and provided theoretical analysis for each algorithm; re-organized upper bound theory part; added the subsection of comparing error rate with other existing methods; provided another improved version of error analysis through Bernstein inequality for martingale

arXiv.org e-Print Archive

Orthonormal Polynomials on the Unit Circle and Spatially Discrete Painlevé II Equation

Author: Chie Bing Wang
Fokas A S
Hastings S P
Hisakado M
Jimbo M
Szegö G
Tracy C A
Publication venue: IOP Publishing
Publication date: 01/01/1999
Field of study

We consider the polynomials $\phi_n(z) = \kappa_n (z^n + b_{n-1} z^{n-1} + \cdots)$ orthonormal with respect to the weight $\exp(\sqrt{\lambda} (z + 1/z)) \, dz/(2 \pi i z)$ on the unit circle in the complex plane. The leading coefficient $\kappa_n$ is found to satisfy a difference-differential (spatially discrete) equation, which is further proved to approach a third-order differential equation by double scaling. The third-order differential equation is equivalent to the Painlevé II equation. The leading coefficient and second leading coefficient of $\phi_n(z)$ can be expressed asymptotically in terms of the Painlevé II function. Comment: 16 pages

arXiv.org e-Print Archive

CiteSeerX

Crossref

Instructional strategies and teacher-student interaction in the classrooms of a Chinese immersion school

Author: Wang Tsueylin Tracy
Publication venue: USF Scholarship: a digital repository @ Gleeson Library | Geschke Center
Publication date: 01/01/2008
Field of study

unavailable

University of San Francisco

Riemann-Hilbert approach to multi-time processes; the Airy and the Pearcey case

Author: Adler
Aptekarev
Bertola
Bleher
Dyson
Harnad
Johansson
M. Bertola
M. Cafasso
Okounkov
Prähofer
Simon
Soshnikov
Tracy
Wang
Publication venue: Elsevier BV
Publication date: 26/04/2011
Field of study

We prove that matrix Fredholm determinants related to multi-time processes can be expressed in terms of determinants of integrable kernels à la Its-Izergin-Korepin-Slavnov (IIKS) and hence related to suitable Riemann-Hilbert problems, thus extending the known results for the single-time case. We focus on the Airy and Pearcey processes. As an example of applications we re-deduce a third order PDE, found by Adler and van Moerbeke, for the two-time Airy process. Comment: 18 pages, 1 figure

arXiv.org e-Print Archive

Crossref

Concordia University Research Repository

HAL Descartes

Sissa Digital Library

Okina

Hal-Diderot

Application Testing Under Developer Specified Device Resource Occupancy

Author: Dai Wendy
Wang Tracy
Publication venue: Technical Disclosure Commons
Publication date: 18/07/2023
Field of study

During normal usage, consumer devices may remain switched on without a shutdown and restart for long durations of time. A lengthy period of time since the last restart can lead to high usage of device resources such as CPU, memory, storage, etc. Program performance issues as well as errors caused by these are hard to detect using clean functional test environments. This disclosure describes techniques to emulate end-user scenarios such as lengthy times since last restart and high resource utilization by providing the developer with the ability to easily configure the usage of the CPU, memory, and storage of a device-under-test (DUT) via a device resources management tool. The device resources management tool is implemented such that it can invoke low level operating system APIs to occupy a specified percentage of resources such as CPU, memory, storage, etc. The extent to which each device resource is occupied can be set in an independent or combined manner. The device resources management tool enables developers to emulate various real world resource utilization scenarios and can help identify bugs that are otherwise rare and/or difficult to reproduce.

Technical Disclosure Commons

Virtual devices as a service

Author: Wang Tracy
Yazigi Rim
Publication venue: Technical Disclosure Commons
Publication date: 24/07/2019
Field of study

Software applications are developed and tested over a large and evolving variety of devices of different device types. Development and testing with physical devices is tedious and time consuming and has scaling and reliability problems. Per techniques of this disclosure, a large pool of virtual devices is instantiated on a compute cluster and made available to software developers as a service. Developers check out as many virtual devices as needed, conduct test and development activity, reset the devices, and release the devices back to the pool. The techniques obviate the need for physical devices and the concomitant issues of cost and reliability and enable large scale testing and development and faster device releases.

Technical Disclosure Commons

When corporate scandal hits retail investors close to home

Author: Giannetti Mariassunta
Yue Wang Tracy
Publication venue: London School of Economics and Political Science
Publication date: 02/02/2017
Field of study

People reduce their participation in the stock market after a case of corporate fraud in their state, write Mariassunta Giannetti and Tracy Yue Wang.

LSE Research Online